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Abstract 

In a recent paper [12] , Hou and Shi introduced a new adaptive data analysis method 
to analyze nonlinear and non-stationary data. The main idea is to look for the sparsest 
representation of multiscale data within the largest possible dictionary consisting of 
intrinsic mode functions of the form {a(t)cos(0(i))}, where a 6 V(0), V(9) consists of 
the functions smoother than cos(6*(t)) and 6' > 0. This problem was formulated as a 
nonlinear L° optimization problem and an iterative nonlinear matching pursuit method 
was proposed to solve this nonlinear optimization problem. In this paper, we prove the 
convergence of this nonlinear matching pursuit method under some sparsity assumption 
on the signal. We consider both well-resolved and sparse sampled signals. In the case 
without noise, we prove that our method gives exact recovery of the original signal. 

1 Introduction 

Developing a truly adaptive data analysis method is important for our understanding of 
many natural phenomena. Although a number of effective data analysis methods such as 
the Fourier transform or windowed Fourier transform have been developed, these meth- 
ods use pre-determined basis and are mostly used to process linear and stationary data. 
Applications of these methods to nonlinear and nonstationary data tend to give many 
unphysical harmonic modes. To overcome these limitations of the traditional techniques, 
time- frequency analysis has been developed by representing a signal with a joint function of 
both time and frequency [10]. The recent advances of wavelet analysis have led to the devel- 
opment of several powerful wavelet-based time- frequency analysis techniques fTS ] IH] [20 ] [T8] . 
But they still cannot remove the artificial harmonics completely and do not give satisfactory 
results for nonlinear signals. 
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Another important approach in the time-frequency analysis is to study instantaneous 
frequency of a signal. Some of the pioneering work in this area was due to Van der Pol |26j 
and Gabor [11], who introduced the so-called Analytic Signal (AS) method that uses the 
Hilbert transform to determine instantaneous frequency of a signal. However, this method 
works mostly for monocomponent signals in which the number of zero-crossings is equal 
to the number of local extrema pQ. There were other attempts to define instantaneous 
frequency such as the zero-crossing method [23l [231 IS] an d the Wigner-Ville distribution 
method [H [161 [22"1 \10\ \15\ I21j . Most of these methods suffer from various limitations. 
For example, the zero-crossing method cannot be applied to study a signal with multiple 
components and is sensitive to noise. On the other hand, the methods based on the Wigner- 
Ville distribution suffer from the interference between different components. 

More substantial progress has been made recently with the introduction of the Empirical 
Mode Decomposition (EMD) method [13]. The EMD method decomposes a signal into a 
collection of intrinsic mode functions (IMFs) sequentially through a sifting process. On the 
other hand, since the EMD method relies on the information of local extrema of a signal, 
it is unstable to noise perturbation. Recently, an ensemble EMD method (EEMD) was 
proposed to make it more stable to noise perturbation [27] . But some fundamental issues 
remain unresolved. 

Inspired by EMD /EEMD and the recently developed compressive sensing theory [61 
[SI [2], Hou and Shi proposed a data-driven time- frequency analysis method in a recent paper 
[12| . The main idea of this method is to look for the sparsest decomposition of a signal 
over the largest possible dictionary consisting of the intrinsic mode functions (IMFs). The 
dictionary is chosen to be: 

V = {acos6: a G V(0), 9' G V{9), and 9'{t) > 0, Vt G M.} , (1) 

where V{9) is a collection of all the functions that are smoother than cos 9(t). In general, it is 
most effective to construct V{9) as an overcomplete Fourier basis in the #-space. For periodic 
signals, we can simply choose V{9) as the standard Fourier basis in the #-space. Then the 
problem can be reformulated as a nonlinear version of the L° minimization problem. 

P : Minimize M 

( a k)l<k<M ,(0fc)l<fc<M 

( M (2) 

Subject to: / = 

I a/c cos Ok G T>, k = 1, • • • , M. 

The constraint / = YlkLi a k cos &k can be replaced by an inequality when the signal is 
polluted by noise. This kind of optimization problem is known to be very challenging to 
solve since both and 9k are unknown. Inspired by matching pursuit [HIES], Hou and 
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Shi [12] proposed a nonlinear matching pursuit method to solve this nonlinear optimization 
problem. The basic idea is to decompose the signal sequentially into two parts, the mean 
plus a modulated oscillatory part with zero mean: 

/ = ao + a± cos 9, (3) 

where the mean ao, the envelope a±, and the phase function 9 are all unknown. We call 
til cos # an Intrinsic Mode Function (IMF). After this decomposition is completed, we can 
treat ao as a new signal and repeat this process until the residual is small enough. 

The objective of this paper is to analyze the convergence of the data-driven time- 
frequency analysis method proposed by Hou and Shi in [12] for periodic signals. We assume 
that the signal / has a sparse representation over the Fourier basis in the 0-space for some 
unknown 9. The main objective of our data-driven time-frequency analysis is to design an 
iterative algorithm to find such 9. With a given approximate phase function 9 n , we solve a 
I 1 minimization problem to obtain the Fourier coefficients of / in the # n -space: 

min||x||i, subject to $gnx = f, (4) 

X 

where each column of matrix is a Fourier basis in the # n -space. We then use this 
coefficient x to update 9 n , and repeat this process until it converges. 

When the signal has sufficiently well-resolved samples, the I 1 optimization problem @ 
can be solved very efficiently by interpolation and Fast Fourier Transform (FFT). In this 
case, the constraint <&Qnx = f becomes a well-posed deterministic linear system provided 
that the coefficient matrix §gn is invertible. The linear optimization problem is then reduced 
to solving this linear system. Since the matrix <&qu consists of the Fourier basis in the 9 n - 
space, the corresponding linear system can be solved approximately by first interpolating / 
to a uniform mesh in the # n -space and then applying FFT. This gives rise to a very efficient 
algorithm with complexity of order 0(Nlog(N)), where N is the number of sample points 
of the signal. Details of this algorithm will be given in Section [2j 

Our first result is for well-resolved periodic signals of the form f(t) = ao(i)+ai(t) cos 9(t). 
We ignore the interpolation error and assume that f(t) is given for all t G [0, T]. We 
further assume that the instantaneous frequency 9'(t) has a sparse representation in the 
Fourier basis in the physical space given by { e i2fc,r */ T , | jfe | < M }, a and m have a sparse 
representation in the Fourier basis in the normalized #-space given by |e* 2fc7r ^, \k\ < Mij, 

where 9 = wpjz^m 1S the normalized phase function. Then we can prove that the iterative 
algorithm will converge to the exact solution under some mild scale separation assumption 
on the signal. More precisely, if the initial guess of 9 satisfies 

^((9° -9)') \\t <ttM /2, (5) 
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where J- is the Fourier transform in the physical space, then there exists r)§ > such that 



provided that L > 770, where t/q is a constant determined by Mq, M\ and L = * 27r • 
We remark that 1/L is a measure of the smallest scale of the signal. The scales of Oq, ax, 
and 9 are measured by 1/Mx and 1/Mo respectively. The requirement L > 770 is actually a 
mathematical formulation of the scale separation property. 

The key idea of the proof is to estimate the decay rate of the coefficients over the Fourier 
basis in the #™-space, where 9 n is the approximate phase function in each step. We show 
that the Fourier coefficients of the signal in the # n -space have a very fast decay as long as 
that n is a smooth function. Using this estimate, we can show that the error of the phase 
function in each step is a contraction and the iteration converges to the exact solution. 

In many problems, a signal may not has an exact sparse representation. A more general 
setting is that the Fourier coefficients of do, ax, and 9' decay according to some power law 
as the wave number increases. We can prove that in this case, our method will converge to 
an approximate solution with an error determined by the truncated error of do, ax and 9'. 
The detailed analysis will be presented in Section 12.21 

For signals with sparse samples, we can also prove similar convergence results with an 
extra condition on the matrix <3?6>n. In this case, we need to use the I 1 minimization even 
with periodic signals. Suppose S is the largest number such that 833(^9") + ^64,3(^9") < 2. 
Under the same sparsity assumption on the instantaneous frequency, the mean and the 
envelope as before, we can prove that there exist t)l > 0, rjs > 0, such that 



provided that L > T]l and S > rjg- Here the columns of the matrix consist of the Fourier 
basis in the # n -space, 8s(A) is the S'-restricted isometry constant of matrix A given in 
[3]. Further, we show that if the sample points {ij}^fi are selected at random from a 



well-known theorem for the standard Fourier basis in [TJ. 

The rest of the paper is organized as follows. In Section 2, we establish the convergence 
and stability of our method for well-resolved signals. In Section 3, we propose an algo- 
rithm for signals with sparse samples and prove its convergence and stability. In Section 4, 
some numerical results are presented to demonstrate the performance of the algorithm and 
confirm the theoretical results. Some concluding remarks are made in Section 5. 




(6) 




(7) 



set of uniformly distributed points the condition 63s {&o n ) + ^4s(^e n ) < 2 holds 

with an overwhelming probability provided that S < CA^ s /(max(^ (logiVb) 6 ) and Nf > 

max{C||# ||iiVft, 2Mo}, where N s is the number of the samples, Nf, is the number of the 
basis. If Mq = 0, which implies that 9 =1, then the above result is reduced to the 
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2 Well resolved periodic signal 



In this section, we will analyze the convergence and stability of the algorithm proposed 
in [TJ] for well-resolved signals. By well-resolved signals, we mean that that these signals 
are measured over a uniform grid and can be interpolated to any grid with very little loss 
of accuracy. In the analysis, we assume that the signal is periodic in the sample domain. 
Without loss of generality, we assume that the signal / is periodic over [0, 1] . 

In order to make this paper self-contained, we state the algorithm proposed in [12] . The 
signal / is given over a uniform grid tj = j/N for j = 0, N — 1. 

• el = e , n = o. 

• Step 1: Interpolate Tk—i from the uniform grid in the time domain to a uniform mesh 
in the ^-coordinate to get r^T 1 and compute the Fourier transform ran 1 ' 



Interpolate far*"" 1 , , (8) 

where 9% ., j = 0, • • • , N— 1 are uniformly distributed in the ^-coordinate, i.e. ■ = 
2nLgn j/N. Apply the Fourier transform to r^n 1 as follows: 

1 N 

%^«) = xT, r fcy i2 ™ Kj > w = -JV/2 + l > ---,JV/2 I (9) 
j'=i 

where 6'Z a = "V, 

K ,J ZuLn 



Step 2: Apply a cutoff function to the Fourier Transform of r^„ 1 to compute a and b 
on the mesh in the ^-coordinate, denoted by agn and bgn: 



k 

k '"k ' 



T- 1 
agn - J- R 



(fjn 1 (u> + L*n) + fj» 1 (lo - L «)) • x (w/ifljf)] , (10) 



bg™ - T e } 

where T~ x is the inverse Fourier transform defined in the 6% coordinate: 
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N / 2 

^i{% 1 ) = N S ^n 1 ^--, J = 0,---,iV-l, (12) 

w =-Ar/2+l 

and x is the cutoff function, 



, , J 1, -1/2 < a; < 1/2, 

X( w ) = ^ (I 3 ) 
0, otherwise. 
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Step 3: Interpolate and bg™ back to the uniform mesh in the time domain: 

a™ +1 = Interpolate (0% tj ,ae*,u\ , * = 0, • • • , N - 1, (14) 
b n k +1 = Interpolate (&k,j,b^,ti\ , i = 0, • • • , N - 1, . (15) 

Step 4: Update 8 n in the i-coordinate: 



d b 



A9' = P V n, — arctan 



) ) , A9(t) = J* A9'(s)ds, 9 n+1 = 9 n + f3A9, 



where j3 E [0, 1] is chosen to make sure that 9^ +1 is monotonically increasing: 

/3 = max ja e [0,1] : ^(9% + aA9) > o| , (16) 

and Pv M() i s the projection operator to the space Vm = span | e * 2fc7r */ T ; ^ = —Mo, • • • , 0, 
and Mq is chosen a priori. 

• Step 5: If — lb < £0j stop. Otherwise, set n = n + 1 and go to Step 1. 

In the previous paper [12], we demonstrated that this algorithm works very effectively 
for periodic signals and is stable to noise perturbation. In this paper, we will analyze its 
convergence and stability. Our main results can be summarized as follows. For periodic 
signals that have an exact sparsity structure, we can prove that the above algorithm will 
converge to the exact decomposition. For periodic signals that have an approximate sparsity 
structure, the above algorithm will give an approximate result withe accuracy determined 
by the truncated error of the signal. In the following two subsections, we will present these 
two results separately. 

2.1 Exact recovery 

In this section, we consider a periodic signal f(t) that has the following decomposition: 

f(t) = fo(t) + h(t) cos 8(t), h(t) > 0, 9'{t) > 0, t e [0,1], (17) 

where /o,/i and 9 are the exact local mean, the envelope and the phase function that we 
want to recover from the signal. 

First, we introduce some notations. Let L = be the number of period of the 

signal which is a measurement of the scale of the signal. 9 = 2nL ' is the normalized phase 
function, which is used as a coordinate in our numerical method and analysis. fo,g(k), fi t e(k) 
are the Fourier coefficients of /o, /i in the ^-coordinate, i.e. 

fo,e(k) = [ fo e- i2 "™$, fi,e{k) = f f\ e~ i27Tk ®d9, (18) 
Jo Jo 
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We also use the notation Fe{ ) to represent the Fourier transform in the 0-space and 
to represent the Fourier transform in the original t-coordinate. 
Now we can state the theorem as follows: 

Theorem 2.1. Assume that the instantaneous frequency 9' is M^-sparse over the Fourier 
basis in the physical space, i.e. 

0' € V Mo = span { e i2k ^ T , k = -M , ■ ■ ■ , 1, ■ ■ ■ , M } . (19) 

Further, we assume that the local mean and the envelope f\ are M\-sparse over the 
Fourier basis in the 9 -space, i.e. 

foM = Mk) = 0, V|fc|>Mi. (20) 
// the initial guess of 9° satisfies 

\\F ((0° - 0)') 111 <^M /2, (21) 
then there exist tjq > such that 

II j- ((<r+i _ | k < 1 1| j- ( (e m _ ^ > (22) 

provided that L > t/q. 

Before giving the rigorous proof, we introduce some notations for the convenience of the 
representation. Let 9 m be the approximate phase function in the mth step, and A9 m = 
9 — 9 m be the error of the phase function in the current step. Let a m , b m be the approximate 
envelope functions, which are obtained by using the algorithm in Step 3. Further, we define 
a m = f 1 cos A9 m , b m = h sin A9 m , and Aa m = a m -a m , Ab m = b m -b m . The quantities 
a m and b m can be considered as the "exact" envelope functions at the mth iteration since 
A9 m = arctan(^). Thus, we would obtain the exact phase starting from 9 rn in one 
iteration. In our analysis, we need to establish a relation among Aa m , Ab m and A9 m . 

One key ingredient of the proof is to estimate the integral e l2,r M-^ ) a ]Q rn '. Fortu- 
nately, for this type of integral, we have the following lemma. 

Lemma 2.1. Suppose <j>'(t) > 0, t G [0,1], <f>(0) = 0, 0(1) = 1, and ip',4>' G V Mo = 
span {e t2klTt , k = —M , • • • , M }. Then we have, 



e ii(> e -i2irui<t> t 



\ mm <h ' / U ^ — -v ■ ^. ^ 



provided that e l ^e l2lTW 'l > is a periodic function. Here P(x, n) is a (n— l)th order polynomial 
of x and the coefficients also depend on n. 
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Remark 2.1. Regarding the polynomial P(x, n), we can get an explicit expression for small 
n. For example, when n = 2, we have 



d 2 



7/2 



/3 + 'i)2 |C 



< 



J2 



+ 



< 



< 



max | ip" | max | if) 1 | max | <j)" | (max | ip 



y3 
n\2 



+ 



12 



h'2 



(min <f>'Y 
1 

(min^') 2 



+ 



(min <j)') 



+ 



mm i 



1 + 



mm 



(24) 



where we have used AO, 9 £ Vm in deriving the last inequality. Then, we have P(x, 2) = 
x + 1. Similarly, we can also get P(x, 3) = 3x 2 + 4x + 3. 

Remark 2.2. Lemma \2. 1\ is valid for any n G N. The integral that we would like to estimate 
in Lemma \2.1\ is actually the Fourier transform of e 1 ^ . Since tp is a smooth function, we 
expect that the Fourier transform of e 1 ^ has a rapid decay for \uj\ large. In Lemma \2.1l 
we give a more delicate decay estimate of the Fourier transform of e 1 ^ . Such estimate is 
required in our proof of Theorem \2.1\ 



Proof. Using integration by parts, then we have 



/' 

Jo 



|27ru;| 



i ^n( e #) 



- i2TTLU 



< 



1 



max 



|27T£j| n te[o,i] 



d n (e^) 



Since e W e -a™<l> i s 

periodic, there is no contribution from the boundary terms when per- 
forming integration by parts. Using the fact that, ip',(f)' G Vm and Mg 6 Vm , we obtain 

max|^)(i)| < ^1(2^-^(^)1 <(2 7 rAf r- 1 ^|?(A ; )| = (27rAfor- 1 ||P|| 1 . (25) 

k k 

Direct calculations give 

cT(e^) 



P 



< 



WW 



-,n 



(min 



1 3=1 



(26) 



Thus, we get 



p 



< 



|u;|™ (min (f>') r 



o) j \m\{- 



This proves Lemma 12. 11 

Now we are ready to prove Theorem 12.11 



(27) 



□ 



S 



Proof, of Theorem \2.1\ 

First, we need to establish the relation between A9 m+1 and Aa m , Ab m . 

Recall that A6 m = arctan (^) . Thus, we have Aft = Aft m -arctan (j^j = arctan 

arctan ( §sr ) • Using the differential mean value theorem, we know that there exists £ 6 [0, 1], 



such that 



AO 




fb m \ 
arctan — 
\a m J 


(b m \ 
— arctan t^— 
\a m J 




(a m + £Aa m )Afc m 


- (b m + £Ab m )Aa m 






(a' m + £Aa m ) 2 


+ (6 m + £A6 m ) 2 




< 


(\a m \ + \Aa m \) 


\Ab m \ + {\b m \ + 


|A6 


m \)\Aa m \ 






((a m ) 2 + {b m 


) 2 )/2- ((Aa m ) 2 


+ (A6 m ) 2 ) 





< L»i|Aa m | +L> 2 |A6 r ' 



where 



D\ = max 



/i + |A6" 



/ 2 /2 - ((Aa m ) 2 + (Ab m ) 2 ) 

2 



Do 



max 

t 



h + \Aa r - 



fljl - ((Aa m ) 2 + (Ab m ) 2 ) 



(28) 



(29) 



and we have used the relations that f\ = (a m ) + (b m ) and \a m \, \ b m \ < f\. 

In the algorithm, there is another smooth process when updating ft, which gives the 
following result for Aft m+1 , 



A9 m+1 = 2vrAL m+1 t + Aft 



p,M ) 



(30) 



where A9 P; m = Pv Mo (^AftpJ 1S the projection of A9 P over the space Vm , Aft p and 
27rAL m+1 i are the periodic part and the linear part of Aft respectively: 



A9 = 2vrAL m+1 t + A9 p . 
Using (|30p . we can estimate (A9 m+l )' as follows, 



(31) 



\F ((Aft m+1 )') \\ % < 2vrAL m+1 + 



A9 



p,Mo 



< 2vrAL + M 



A9. 



p,Mo 



< 2HA0HOO +M 2 



Aft r 



< (3M 2 +2) || Aft | 



where we have used the fact that 



2vr|AL m+1 | = |Aft(l)- Aft(0)| <2||A0| 



A9 T 



+ 2ttAL < 3 



A9 



Combining (J32J) with (|2gj). we get 

||jr((A0 m + 1 ) / )|| 1 < (3M 2 + 2) (Z) 1 ||Aa m || 00 + D 2 ||A6 r ' 



(32) 



(33) 
(34) 



(35) 
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Next, we will establish the relation among Aa m , Ab m and A9 m . This can be done by 
estimating the Fourier coefficients of a™ - , b™ in the # m -space. 

In Appendix A, we derive the following estimates of Aa m and Ab m (see (|135j) . (|136p ). 



|Aa m | <2 Yl f°M k ) + E 



O0" 



(k)\ + b em (k) ) + |a*»(*)l> ( 36 ) 



±L m <fc<f L" 



|L" l <fc<§L" 



l*l>^ 



A6 m |<2 £ foMV + E (fa"*(*OI+ &*»(*)) + E |^(fc),(37) 



§L m <fc<fi> 



where /o,e m 5 and are the Fourier transform of /o, a m and 6 m in the # m -space. 

To obtain the desired estimates, we need to use Lemma 12.11 to estimate the Fourier 
coefficients of fo, a m ,b m in the # m -space. In an effort to make the proof concise and easy 
to follow, we defer the derivation of the estimates (|38p . (|39p and (|40p to Appendix B. The 

II 7-\( A# m VI IN 

main results of Appendix B are summarized as follows. As long as 7 = - — 2 7rM — V^j 
we have 



|/o,flm(w)| < C Q ( y ) M "M l7 , V|w| > L/2 



(38) 



< 4C Q ( y ) M n (2M! + 1)7, V|w| > L/2. 



(39) 



where 



\bf m (u)\ < 4C Q ( y ) M "(2Mx + 1)7, V|w| > L/2. 



Q 



P(z,n) 



yiiii „, ii^KAmiii 



mm 



-77m . , 5 



min(# )' 



7 



2vrM n 



(40) 



(41) 



Using (j36j) - ([4T)]) and the fact that YlT=i ^ " converges as long as n > 2, we conclude 



that 



|Aa m | < T Q(aLr n+1 j, 
|A6 m | < r Q(aL)-" +1 7 , 



(42) 
(43) 



where Tq is a constant that depends on Mo, Mi and n. It follows from (|35|) . (|4ip . (|42j) and 
(1431) that 



(44) 



||^((A0 m+1 ) / )|| 1 < ri(Di + D 2 )Q(aLy n+1 (jj 7 ((A6> m )') 
where T\ is a constant that depends on Mq, Mi and n. 
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To complete the proof, we need to show that there exists a constant t]q > which does 
not change in the iterative process, such that /3 = T\(Di + D2)Q{aL)~ n+1 < 1/2 provided 
that L > rjo. This seems to be trivial, simply choosing 770 = — (2T\(Di + D2)Q) 1 ^ n ~ 1 ' 
would make (3 < 1/2 provided that L > ijq. The problem is that D\,D2,Q,a vary during 
the iteration. We need to show that they are uniformly bounded during the iteration. 

It is relatively easy to show that a is bounded, 



1 - a 



(1) - 9 m (0) 



6(1) -6(0) 



A6 m (l) - A6 m (0) 



2vrL 



< iKA^yiloo < ^[(A^yiiix < Mo 



which implies that 7/8 < a < 9/8, provided that L > 2M and 7 < 1/4. 

It is more involved to show that Q is bounded. We need to first estimate |(# m )'| and 

\(r i )'\ = \6' - (A6 m )'/(27TL m )\ >l(0'- m(A6 m )']\\i/(27TL)) >l(#- ||) ,(45) 



and 



m(T)% = -||^-^[(A^) / ]/(2^L)|| 1 < - (||?|| 1 + ||J-[(An / ]lli/(2^) 
a a \ 



< ~ (||^||i + M /(4L)j 



(46) 



where we have used the assumption that 7 < 4. If L satisfies the following condition 



then we can get 



^<2min(6>'), (47) 



tfin'Up, 11^)1111 <y (48) 



where we have used the fact that min(6 ) < max(# ) < \\6 ||i. It follows from (|48p that the 
term z defined in (|4ip is uniformly bounded, 

z < z , (49) 

where zq is a constant depending on 6 . 

Based on the above estimation of z, the term Q in (j41|) can be bounded by a constant, 

P(z,n) f9\ n P(z ,n) 
Q = 7 \n < 7 7 =7v^ = Qo, (50) 



min(# yj \ / (mm 6 
where Qq is a constant that depends on and n. 
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We now proceed to bound D\ and D2- Note that if |Aa m |, |A6 m | < ^min/i, we can 
bound D\ as follows: 



D\ = max 
< max 



+ |A6 r * 



(( m)2 + (&m)2)/2 _ ((Aa m ) 2 + (A6 m ) 2 ) 
l/i| + |A6 m | 



(/!) 2 /2 - ((Aa™) 2 + (A& 



<ii^ = Eo . (5i) 

mm ji 

Similarly, we can show that D2 < -Eo- 

It is not difficult to see that the condition | Aa m \ , \ Ab m \ < min j\ is valid if L satisfies 

T Q (7L/8)- n+1 < v^min/x, (52) 

since we have 

|Aa| < r Q(aL)- n+1 7 < Jr Q (7L/8)- n+1 , (53) 

|A6| < T Q(aL)- n+1 7 < h Q (7L/8)' n+1 , (54) 

where we have used a > 7/8, Q < Qq, the assumption 7 < 4 and the estimates ([36]) . (J3TJ) . 
Finally, we derive the following estimate for the error of the instantaneous frequency, 

||^((A0 m+1 ) / )|| 1 < (3 \\T ((AO™)')^ , (55) 

where f3 = ri£oQo(7-£/8)~ n+1 , ^1 is a constant depends on Mq,Mi,ti, Eq depends on 
min/i, and Qo depends on o' and n. 

Now, we prove that if 7 = ^^il^ ^ < |, then we have 

||^((A^+ 1 )')|| 1 <^||-F((An , )lli' ( 56 ) 
as long as L satisfies the following conditions 

L>4Mi, ^ < min ji,2min(7/) j , (57) 

r Q (7L/8)-" +1 < V^min/x, (58) 
r 1 ^ Qo(7L/8)™- 1 < i (59) 

It is obvious that there exist r/Q > 0, such that conditions ([5?]) - ([59]) are satisfied provided 
that L > 770. Here 770 is determined by Mo , M\ , , min f\ and n which does not change 
during the iteration process. 
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By induction, it is easy to show that if initially 

2ttM ~ 4' 

then there exists 770 > which is determined by Mo, M\,9 , min /1 and n, such that 

(^((A^+^II^^^^AnOII^ (60) 

as long as L > t]q. This completes the proof of Theorem 12.11 □ 

Remark 2.3. The above proof is valid for any n>2. Note that rjo depends on n. Theoret- 
ically, there exists an optimal choice of n to make rjo the smallest. By carefully tracking the 
constants in the proof, we can show that as n going to +00, rjo tends to SC(n) 1 ^ n ~ 1 ' Mq, 
where 5 is a constant independent on n, and C(n) is the maximum of the coefficients of 
polynomial P(x,n) appears in Lemma \2.1\ We conjecture that C(n) 1 ^ n ~ 1 ^ is bounded for 
n>2. If this is the case, then rjQ is proportional to Mq. 

Remark 2.4. Classical time- frequency analysis methods, such as the windowed Fourier 
transform or wavelet transform, in general cannot extract the instantaneous frequency ex- 
actly for any signal due to the uncertainty principle. For a single linear chirp signal without 
amplitude modulation, the Wigner- Ville distribution can extract the exact instantaneous fre- 
quency, but it fails if the signal consists of several components. Theorem \2.1\ shows that our 
data-driven time-frequency analysis method has the capability to recover the exact instanta- 
neous frequency for a much larger range of signals. 

2.2 Approximate recovery 

If the signal does not have an exact sparsity structure in the #-space as required in Theorem 
12.11 our method cannot reproduce the exact decomposition. But the analysis in this sub- 
section shows that we can still get an approximate result and the accuracy is determined 
by the truncated error of the signal. The main result is stated below. 

Theorem 2.2. Assume that the instantaneous frequency 9' , has a sparse representation, 
i.e. there exists Mq, such that 

e'(t)£V Mo = span{e i2k ^ T ,k = -M ,--- ,M }. (61) 

and the Fourier coefficients of the local mean /q and the envelope f\ in the 9-space have a 
fast decay, i.e. there exists Cq > 0, p > 4 such that 

\fo,e( k )\ < C o\k\~ p , \fi,e(k)\ < C \k\-P. (62) 
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Then, there exists rjo > 4 such that if L > r/o and t/ie intial guess satisfies 



J-((0 o -0)') 



i < ^M /2, 



(63) 



i/ien we /lave 



<T {L/A)-v +2 + l -\\F{{e m -e)') 



1 ' 



(64) 



where Tq > is a constant determined by Cq, p, Mq and f\. 

Remark 2.5. This theorem shows that our iterative method will converge to the exact 
solution up to the truncation error determined by the scale separation property. 

Proof. The proof is very similar to the proof of Theorem 12,11 The only difference is that 
the estimates of fo,e m (k), a^L and b™ m are more complicated since they are not sparse in 
the #-space. Here we only give these key estimates. 
For f 0j g m (uj), uj 7^ 0, we have 



where a = L m /L and fo,e(k) are the Fourier coefficients of /o as a funntion of 9. Note 
that the integral is when k = and uj ^ 0. Thus we exclude the case k = in the 
above summation. In the derivation of the last equality, we have used the relationship that 



As in the proof of the previous theorem, we also need to use Lemma [2. 11 In the previous 
proof, we can choose n to be any positive integer that is greater than 2. In the current 
theorem, the Fourier coefficients |/o,e| and \fi t $\ decay according to some power law. To 
obtain the desired estimates, we need to take 2 < n < p — 2. This is why we require p > 4. 




(65) 



9 = 9/L = (6 m + A9 m )/L = 9 m /L + A9 m /L = a0 + A6 m /L. 



14 



Applying Lemma |2. II to the last equality of (|65p . we have 

-i 



\foM«>)\ < E^WI 

k^O 



J27T(ak-ui)e e ikA8 m /L ^ 



\k\>^d o<|fc|<H 

1 1 2a ' 1 — 2a 



J2ir(ak-uj)e m & ikl\d m / 'L ^ 



W>fe) »<l*l<ISi 3=1 

77. ( 



k 




L 


'( 



I 2vrM 



< C n 



|w[/(2a) 



x-Pdx + C Q V M o r ' 



E i^ p+n E^ L ) J 

\o<|fc|<|^ / V =1 



I I \ /I I \ — n 



(66) 



where we have used the assumption n < p — 2, 7 < 1/4, and the fact that L > 1 is the 
number of the periods within the time interval [0, 1]. Here Co is a generic constant, Q, z 
and 7 are defined below: 



Q 



P(z,n) 



min(# )' 







(67) 



Using an argument similar to that as in the derivation of (|66h . we can get the desired 
estimates for aTL and WL as follows: 



|2^(w)| < C 



Co' 

2a 



-p+i 



+ Q 



/m(o) 



|a;|- n M^7 + C Q 



M " 7 . (68) 



Ml < C n 



\UJ\ 

2a 



-p+i 



+ Q 



/i,e(0) 



U-"M n 7 + C Q 



M « 7 . 



(69) 



The estimates (j36|) and (|37p remain valid in this case. Thus we obtain upper bounds 
for Aa m and A6 m by substituting ([68]) and ([69]) into (I55|l and (1371) . 



-n+1 



7) 



|Aa m | < r!L^ +2 + r 2 Q(aL)- 

|A6 m | < r 1 L- p+2 + r 2 g(aL)-" +1 7, 



(70) 
(71) 



where Fi is a constant depending on Co, r 2 depends on p and max |^Co, 

Moreover, by following the same argument we did in the proof of Theorem 12. II . we can 
obtain an error estimate for the instantaneous frequency, 

||j-((A0 m+1 ) , )|| 1 < T 3 E (L/4)-p+ 2 + T 4 E Q (7L/8)- n+1 \\? ((A0 m )') Hi . (72) 
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as long as 7 < 1/4 and the following conditions are satisfied 



L>2M , — - < 2min(0'), (73) 
r 1 (L/4)-P+ 2 + r 2 Qo(7L/8)-™ +1 < v^min/x, (74) 
r 3 E (L/4)-v +2 + ±T A Q E (7L/8)- n+1 < (75) 

T A Q E (7L/8)- n+1 < ~ (76) 

where T3,r4 are constants that depend on Cq,P, Mq, min f\ and . Using these four con- 
straints, we can easily derive a constant t]q > 4, such that all these conditions are satisfied 
provided that L > t]q. On the other hand, since L > 4 and n > 2, (fT6|) implies that 
r 4 Qo^o < 1/2. This proves 

||jr((A^+ 1 y)|| 1 < r £ (L/4)^+ 2 + i ||^ ((AO') Hi ■ ( 77 ) 
This completes the proof of Theorem 12.21 □ 

Remark 2.6. The constraint n < p — 2 in the above proof can be relaxed to p > 3 by using 
a more delicate calculation. 

If we further consider a more general case: the instantaneous frequency is also approxi- 
mately sparse instead of exactly sparse as we assume in Theorem 12.11 and I2.2L In this case, 
we can prove that the iterative algorithm also converges to an approximate result. However, 
we cannot apply Lemma 12. II here and need the following lemma instead. 

Lemma 2.2. Suppose cp'(t) > 0, t G [0, 1], 0(0) = 0, 0(1) = 1, and 

\$'(k)\,\ii'(k)\<c\k\~p, y\k\>M . 

Then for n < p — 1, we have 

p ( \W\W,m +cm-^ 

rl r I mine// ' n ; - , _ . , 

/ e^e-^d<p < V Un (min6/)n J A® XfaiUb)-' (ll#lll.M + CM/ +1 ) 



\oj\ n (min0 / )™ 



provided that e 1 ^ e l27TU "t > is a periodic function. Here \\ip'\\i t M = Yl\k\<M l^'WI andP(x,n) 
is the same (n — l)th order polynomial as in Lemma \2.1\ 

Proof. The proof is similar to the proof of Lemma 12.11 The only difference is that we need 
the following estimate instead of (1251) . 

max|V (n) (t)| < ^|(2^) n - 1 ^'(A;)| < (2vrM ) n - 1 |^>(fc)| + (2tt)"- 1 C ^ \k\~ p ^ n ~ 1 

k \k\<M \k\>M 

< (2nM ) n ~ 1 (||^||i,Mo + CM/+ 1 ) . (78) 

□ 
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Using this lemma and following an argument similar to that as in the previous two 
theorems, we can prove the following theorem: 

Theorem 2.3. Assume that the Fourier coefficients of the instantaneous frequency 9', the 
local mean /o and the envelope f\ all have fast decay, i.e. there exists Cq > 0, p > 4 such 
that 

\H0')(k)\ < C \k\-P, \Mfo)(k)\ < C \k\-r, \Mfi)(k)\ < C \k\- p . (79) 
If L is large enough and the intial guess satisfies 

\\^({0°-0)') ||i <vrM /2, (80) 

then, we have 

|| j- _ h < r (L/4)-P+ 2 + \c M-v +l + \ || J" {{d m - 6)') \l , (81) 

where Fq > is a constant determined by Co, M$ and f\. 

Remark 2.7. In the analysis presented in this section, we have assumed that the Fourier 
transform in the 6 m -space, J 7 gim(-), is exact. In real computations, we need to first interpolate 
the signal from a uniform grid in the physical space to a uniform grid in the m -space, then 
apply the Fast Fourier transform. This interpolation process would introduce some error. 
However, the interpolation error should be very small since we assume that the signal is 
well resolved by the sample points. 

3 Periodic signal with sparse samples 

In this section, we will consider a more challenging case, the sample points tj, j = 1, • • • , N 
are too few to resolve the signal. In this case, the algorithm presented in the last section 
does not apply directly. The reason is that the Fourier transform in the # m -space, J 7 em(-), 
cannot be computed accurately by the interpolation-FFT method. One way to obtain the 
the Fourier transform in the # m -space is to solve a linear system. Such method is very 
expensive. Moreover, the resulting linear system is under-determined since we do not have 
sufficient number of sample points. 

Thanks to the recent development of compressive sensing, we know that if the Fourier 
coefficients are sparse, then I 1 minimization would give an approximate solution from very 
few sample points. Hence, we can use a I 1 minimization problem to generate the Fourier 
coefficients in the 6* m -space in each step: 

• 0° = e , m = 0. 
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Step 1: Solve the l\ minimization problem to get the Fourier transform of the signal 
r m in the # m -coordinate: 

fem = arg min ||x||i, subject to Ag™ ■ x = f (82) 

x£R N b 

where Agm G M. NsXNb , N s < Nj,, N s is the number of samples and N b is the number 
of Fourier modes. A m(j, k) = e i2 ^ m fe), j = 1, . . . ; N s , k = -N b /2 + 1, • • • , N b /2 

, -ffrn _ 6> m -6> m (0) 
anQ u (9 m (T)— m (O) ' 

Step 2: Apply a cutoff function to the Fourier Transform of r^ -1 to compute a 



and 6 m+1 : 



m+l _ tt-1 
im+l _ t— 1 

-1 • 



f/flm (OJ + Lflm) + / r (w - Lgmfj ■ X (uJ / Lgm) , (83) 
-% ■ (jem (uj + Lg m ) -Jgra{oj- Lgmj^j ■ x (u/Lgm) , (84) 



where J- gm is the inverse Fourier transform defined in the # m -coordinate: 

J# (/*») = E /^K 2 ™^, j = 1, • • • , N 8 , (85) 

uj=-N b /2+l 



and x is the cutoff function, 



, s J 1, -1/2 < w < 1/2, 
X( w ) = S . (86) 

I 0, otherwise. 

• Step 3: Update 9 m in the i-coordinate: 

M ' = Pvm o (j t ( arctan (^Tl) ) ) ' A9 ^ = [ A6 '( s ) ds > ° m+1 = 0m + 
where j3 € [0, 1] is chosen to make sure that 9 m+ is monotonically increasing: 

/3 = max |a G [0, 1] : (6 m + aA9) > o| , (87) 

and -fV M is the projection operator to the space Vm = span | e * 2fc7r */ T ; = —Mo, • • • , 0, 
and Mq is chosen a priori. 

• Step 4: If — &k\\2 < eo, stop. Otherwise, set n = n + 1 and go to Step 1. 

Suppose the sample points tj, j = , N s are selected at random from a set of 

uniform grid l/Nf, I = 0, • • • , Nf — 1, then the optimization problem (|82|) in Step 1 can be 
rewritten in the following form: 

min ||x||i, subject to <&gm ■ x = f, (88) 
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where / = y v N J f and <&gm is obtained by selecting N s rows from an Nf by Nj, matrix 

U 0m which is defined as U 0m (j,k) = J ■ e^^™^, j = ,N f , k = -N b /2 + 

1, • • • , JVfo/2. As we will show later, the columns of Ugm are approximately orthogonal to 
each other. This property will play an important role in our convergence and stability 
analysis. 

We remark that our problem is more challenging than the compressive sensing prob- 
lem in the sense that we need not only to find the sparsest representation but also a basis 
parametrized by a phase function 9 over which the signal has the sparsest representation. 
To overcome this difficulty we propose an iterative algorithm to solve this nonlinear opti- 
mization problem. 

3.1 Exact recovery 

Theorem 3.1. Under the same assumption as in Theorem \2.1\ there exist r]Q > 0, rji > 0, 
such that 

l|jr ^ e m+i _ e y^ h < 1 1| j- _ e)') \\ x , (89) 

provided that L > rjo and S > r)\, where S be the largest number such that 63s (&e m ) + 
354s(&e m ) < 2. Here 63(A) is the S-restricted isometry constant of matrix A given in J3J/, 
which is the smallest number such that 

(l-5s)\\c\\l<\\A T c\\l<(l + 5 s )\\c\\l, 

for all subsets T with \T\ < S and coefficients sequences (cj)j^T- 

To prove this theorem, we need to use the following theorem of Candes, Romberg, and 
Tao 0. 

Theorem 3.2. Let S be such that £35(^4) + 3545(^4) < 2, where A G R nxm , n <m. Suppose 
that xq is an arbitrary vector in M m and let xo.s 1 be the truncated vector corresponding to 
the S largest values of xo. Then the solution x* to the l\ minimization problem 

min||x||i, subject to Ax = f (90) 

satisfies 

\\x* - Xq\\i < C 2 ,s ■ \\%o - x ,s\\i- (91) 
Now we present the proof of Theorem 13.11 
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Proof, of Theorem \3.1[ Using (|137p and (|138p in Appendix B, we have 



|Aa m | < 2 V f 0) g m (k) + Y. 



<m{k)\ + b^ m {k) 



±L m <k<%L n 



Y \agtn(k)\+2 Y 



V< fc <l ir - 



/em (A;) - fgm(k) 



< T Q{aL)- n+1 1 + C 2 



S ■ 1/0" 



s\\i, 



(92) 



where To is a constant depending on Mo,M\,n and fgmg is the truncated vector corre- 
sponding to the 5 largest values of fgm . 

Without loss of generality, we assume that L m > 5/3, and define f 0m $ to be 



fem(k), k G [-L m - 5/6, -L m + 5/6] U [-5/6, 5/6] U [L m - 5/6, L m + 5/6], 



0. 



otherwise. 



Then by the definition of fgm g and fg m s , we have 

Wfe™ — fe m ,s\\i < Wfe™ — fe m ,s\\i 

Y \fo-(k)\+ £ \fe^k)\ 

S/6<\k\<L™-S/6 |fc|>L m +5/6 

< Y \foM k )\+ Yl \ao™(k)\+ Y &8 m ( k )\ 

\k\>S/6 \k\>S/6 \k\>S/6 

< riQ5-" +1 7 . 

Substituting ([93]) into ([92j), we get 

|Aa m | < (r (aL)- n+1 + C^sTiSr^ 1 ) Q 7 . 

Similarly, we obtain 

|A6 m | < (r (aL)-" +1 + C 2 , S TiS- n+l ) Qy. 



(93) 



(94) 



(95) 



Using these two key estimates and follow the same argument as that in the proof of Theorem 
12.11 we can complete the proof of Theorem 13.11 □ 

Remark 3.1. The above result on the exact recovery of signals with sparse samples can be 
generalized to the case that we consider in Theorem \2.2\ by combining the argument of the 
above theorem with the idea presented in the proof of Theorem \2.2\ In this case, we can 
recover the signal with an error which is determined by L, S and the decay rates of fo t g, fi t g 
and 9'. 
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In Theorem 13. 1\ we assume that in each step, the condition 83s (&e m ) + ^is(^e m ) < 2 
is satisfied. Using the definition of 5s, it is easy to see that 63s < ^4S- Thus, a sufficient 
condition to satisfy 63s ($e m ) + Sdisi^o™) < 2 is to require 64s (^e m ) < 1/2. 

In compressive sensing, there is a well-known result by Candes and Tao in [3]. This 
result states that if the matrix £ jjMxiv j g obtained by selecting M rows at random from 
an N x iV Fourier matrix C7 where fX^fc = -A^ e l2 ^i k / N ^ j^fc = l r ■ ■ N, then the condition 
<5s(<3?) < 1/2 is satisfied with an overwhelming probability provided that 

M 

S<C——;, (96) 
(log N )° 

where C is a constant. 

In our formulation (see (|88p ). the matrix <&gm also consists of N s rows of a Nf-by-Nb 
matrix Ugm. The main difference is that the matrix Ugm is not a standard Fourier matrix. 
Instead it is a Fourier matrix in the # m -space which makes it non-orthonormal. As a result, 
we cannot apply the result of Candes and Tao in [3] directly. Fortunately, we have the 
following result by slightly modifying the arguments used in [3] which can be applied to 
matrix Ugm. 

Theorem 3.3. If uq = max^j \UqUq — I)k,j\ < , where Ug is the conjugate transpose 
ofUg, the condition 63(^9) < 1/2 holds with probability 1 — 5 provided that 

N S >C- max(0)' (5 log 2 N b - log 5) log 4 N b , (97) 

where N s is the number of the samples, Nf, is the number of elements in the basis. 

This theorem shows that if the columns of Ugm are approximately orthogonal to each 
other, it has a property similar to the standard Fourier matrix. Consequently, we need only 
to estimate the mutual coherence of the columns of the matrix Ugm for 9 m £ Vm ■ 

Lemma 3.1. Let <f)'{t) £ V Mo , t £ [0,1] and 0(0) = 0, 0(1) = 1, <j)' > 0, tj = j/L, j = 
0, -,L — 1 is a uniform grid over [0, 1], then for any n £ N, there exists C(n) > 0, such that 

lf>(*,K M *<''> < CW-b { (Mfli)", (M*)"} . (98) 

The proof of this lemma is deferred to Appendix C. 

Using this lemma, we can show that the condition isq = max^j \ Ug m Ugm — I)k,j\ < TBTv^ 
is satisfied as long as Nf > C||J 7 ((^ m ) / )||iA' r fc where C is a constant determined by JVj,. This 
leads to the following theorem. 

Theorem 3.4. Suppose the sample points tj, j = 1, • • • ,N S are selected at random from a 
set of uniform grid l/Nf, I = 0, • • • , Nf — 1. If 
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in (m + l)st step, we have 5s(&9 m ) < 1/2 holds with probability 1 — 5 provided that 

N S >C- max[(T)'] (S log 2 N b - log 5) log 4 iV 6 , (99) 

where N s is the number of the samples, N b is the number of elements in the basis. 

The above result shows that if the sample points are selected at random, in each step, 
with probability 1 — 5, we can get the right answer. This does not mean that the whole 
iteration converges to the right solution with an overwhelming probability. If the iteration 
is run up to the nth step, the probability that all these n steps are successful is 1 — n5. If 
n is large, the probability could be small even if 5 is very small. 

3.2 Uniform estimate of 8s(&e m ) during the iteration 

In order to make sure that the iterative algorithm would converge with a high probability, 
we have to obtain an uniform estimate of 5s($e m ) during the iteration. More precisely, we 
need to prove that with an overwhelming probability, 

sup 8 s ($e)<l/2, (100) 

where W Mo = {0 G C°°[0, 1] : 0(0) = 0, 0(1) = 1, 0' G V Mo , <f/(t) > 0, Vt G [0, 1]}. 

The analysis below shows that this is true even if the number of sample points is in 
the same order as that required by Theorem 13.41 There are two key observations in this 
analysis. The first one is that the difference between 5s($q) and 5s ($^) would be small if 
9,4> 6 Wm an d \\0 — 0||oo is small. Actually, we can make \6s(&q) — 5s (3*^)1 < \ as long 
as \\0' — 4>\\oo < T = 0(N b 5//2 M " 1 ). The second observation is that Wm is bounded and 
finite dimensional which implies compactness. Then for any r > 0, there exist a finite subset 
A r C Wm , such that for any 9 £ Wm , there exists G ^4 r , such that \\9 — (j)j\\oo < r. 

Based on these two observations, we can show that 

sup 5 s ($e) < sup 5s(^) + 1/4. (101) 

eew M() <t>eA r 

Then by the union bound, we have 

P ( sup 5s($e) > 1/2 ] < P \ sup 5s(^) > 1/4 ) < \A r \ sup P{5 s {<$><t>) > 1/4) . (102) 

\6eW Mo J \4>&A r J 4>&A r 

It is sufficient to prove that 

P(Ss(<S>4>) > 1/4) < 5/| A|, V0 G A r c H^o, (103) 
which is true as long as 

N s > C-maxWWoo (Slog 2 N b + log \A r \ - log 5) log 4 N b . (104) 

6»eA r 

Now, we need only to choose a proper r and estimate the corresponding \A r \. 
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Lemma 3.2. Let W = G C°°[0,1] : 0(0) = 0,0(1) = 1,0' € V Mo , <£'(*) > 0, Vt G [0,1]}. 
For any r > 0, one can /ind a finite subset A r of W with cardinality 

, „ , /16ttM 2 \ 2Mo , , 

\A\ < f — ^ + > ( 105 ) 

snc/i i/iat /or a// ^ G ^> i/iere exists (f) £ A r such that \\ip' — 4>'\\oo < r and [|"0 — 0[|oc 5; r - 

Proof. Let W = {4>' : <f) £ W}. Then for all ^ £ W", we have the following Fourier 
representation 

^(t) = 1 + ^( Cj cos(2vrji) + dj sin(2vrjt)) > 0, Vi G [0, 1]. (106) 
i=i 

Since J Q * ^(s)ds G according to the definition of W, then Jg 1 ip(s)ds = 1, so the constant 
in the above Fourier representation is 1. 

By multiplying l+cos(2njt) to both sides of (|106p and integrating over [0, 1] with respect 
to t, we get 

1 + Cj /2 > 0, 

which implies that Cj > —2, where we have used the fact that 1 + cos(2ir jt) > 0. 

On the other hand, multiplying — 1 + cos(2ir jt) to both sides of (I106p and taking integral 
over [0, 1] with respect to t, we have Cj < 2. Combining these two results, we have 

\cj\ < 2. (107) 

Similarly, by multiplying sin(27rjt) db 1 to both sides of (|106p and taking integral over 
[0, 1] with respect to t, we obtain 

\dj\<2. (108) 

Now, we have proven that for any function in W, its Fourier coefficients are bounded 
by 2. 

Let h = r/(2M ), L r = \4/h~\ , Z r = {-2, -2 + h, -2 + 2h, ■ ■ ■ , -2 + (L r - l)h}. 
For any tp G W, we know that its Fourier coefficients Cj,dj G [—2,2], j = 1, ■ • ■ ,Mq, 
then one can find dj , 6j G Z r correspondingly such that 

\a 3 - Cj \ < h/2 = r/(4M ), j = 1, • • • , M , 
|fy - < /i/2 = r/(4M ), j = 1, • • • , M , 

which implies that there exists y £ Y r such that 

M 

[|^ - y[|oo < 5^(l a J ~ c il + l 6 i - d i\) ^ 2vrM 2 /i = r/2 (109) 
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where Y r is defined as follows 

M 

Y r = {y = Y^( aj cos(2vrjt) + bj sin(2vrjt)) : a s ,b s G Z r , B r/2 (y) nF/l}, 

3=1 

and B r/2 {y) = {z G V Mo ■ \\z ~ y\\oo < r/2}. 
By the definition of Y r , one can get 



Y r \ < \Z r \ 2M ° = L 2M ° < f + l) . (110) 

Suppose Y r = {yi,y2, • • • , 2/iy r i}) by the definition of y r , for each j/j, there exists cf>j G 
such that (f>j G -B r / 2 (y) • We can get a finite subset ^4 r of W by collecting all these <j>j together 
and obviously \A r \ = \ Y r \. 

Finally, let 

A r = ^Jj)(s)ds : 0€3rJ. (Ill) 

Then, for any ip £ W, there exists <f>j G A r and y,- G YV, such that 

11^ ~<t>'j Hoc < llV'-yilloo + lly.-^lloo <r/2 + r/2 = r. (112) 
Moreover, we have 

U - <t>j\\°c < J W(s) - #(s)|ds < r, (113) 
where we have used the fact that ^(0) = <ftj(0) = to eliminate the integral constant. □ 



Remark 3.2. By multiplying Cj cos {2-Kjt) + dj sin(2yrjt) ± Jc 2 - + d] to both sides of HWj) 
and taking integral over [0, 1] with respect to t, we have 

c 2 + d 2 <4, j = l,---,M . (114) 

This implies a sharper estimate of \A r \, 

, A , /8vrM n 2 \ Mo , , 

\A\ < i^—^j . (115) 

Also, hll4\ ) gives us a bound for ||0'||oo in Wm , 

sup Halloo < 4M + 1. (116) 

which will be used later. 
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It remains to choose a proper r. First, we show that the difference of 5s between two 
matrices can be controlled by the difference of each element. 

Proposition 3.1. Let A,B are two M by N matrices, M < N and the columns of A are 
normalized to be unit vectors in I 2 norm. Then, for any S £ N, we have 

\5 s (A)-5 s (B)\<(2eVM + e 2 M)S, (117) 

where e = maxjj \Aij — Bij\. 

Proof. By the definition of 5,s, we need only to prove that for all subsets T with \T\ < S 
and coefficients sequences (cj)j £ T, 

| ||A T c||l - ||5tc||1| < (2eVM + e 2 M)S\\c\\ 2 2 . (118) 
This can be verified by a direct calculation: 

|Ptc||I-||£tc||I| = I ^2 CiCjiAfAj - BfBj)\ 

= I c i c j{Dj Aj + AjDj + DjDj)\ 

< max \DfAj + AJDj + D?Dj\ £ \c iCj \ 

< |T|||c|||max(||A||2Pj||2 + PilM-Djlk + || AlbPjlb) 

< (2ev / Mmaxp i || 2 + e 2 M)S\\c\\ 2 2 . (119) 

In the above derivation, D = B — A, Ai,Aj are ith and jth columns of A. □ 

Using the above proposition, we obtain the following result: 
Corollary 3.1. Let 6, eW, then 

\Ss(^)-Ss(^)\<l, (120) 

— ' — 'i —9 — 1/2 

provided that \9 — 4>\ < CN b M Q , where C is an absolute constant. 

Proof. We need only to show that the difference between <f>g and $^ can be controlled by 
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I . This is quite straightforward using the definition of and <3?^: 



\^U,k)-^(j,k)\ 



< 



< 



< 



3 j27rfc(9(t i )- ( /)(t : ,) 



-2<rrk\e(tj) - <j>(tj)\ 



| 27rjV b e % /4M + 1 



(121) 



where we have used the estimate \\6 < 4Mo + 1 given in (11160 . Using Proposition 13.11 
and the fact that S < N b , we can complete the proof. □ 



Combining Lemma 13.21 Corollary 13.11 and (11040 . we have the following theorem, 

Theorem 3.5. supg &WM ^ b~s(&e) < 1/2 holds with probability 1 — 5 provided that 

N S >C- (4M + 1) (S log 2 N b + M log N b - log 5) log 4 N b , (122) 

where N s is the number of the samples, N b is the number of elements in the basis. 

Remark 3.3. Comparing with the condition stated in Theorem \3.4\ we require extra Mo log 5 N b 
samples in order to get the uniform estimate. But this number Mq log 5 N b can be absorbed 
by /Slog 6 N bl since S is larger than Mq. Thus the condition to get an uniform estimate is 
essentially the same as that in Theorem \3-4\ 



4 Numerical results 

In this section, we will perform several numerical experiments to confirm our theoretical re- 
sults presented in the previous section and to demonstrate the performance of the algorithm 
based on the weighted I 1 optimization. 

Example 1: Exact recovery for a well-resolved signal 

The first example is a well-resolved periodic signal. In this example, the mean and the 
envelope have a sparse Fourier representation in the 0-space and the instantaneous frequency 
has a sparse Fourier spectrum in the physical space. The signal we use is generated by the 
following formula: 

6 = 20vrf + 2cos2vrt + 2sin4vrt, 6 = 6/10 

ao = 2 + cos 6 + 2 sin 26 + cos 36, a\ = 3 + cos 6 + sin 36* 

/ = cio + ai cos 6. (123) 
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Figure 1: Left: Original signal; Right: Error of the IMF and the phase function. 

This signal is sampled over a uniform mesh of 256 points such that there are about 12 
samples in each period of the signal on average. 

The numerical results are shown in Fig. Q] and Fig. [2j In Fig. [H we can see that our 
algorithm indeed recovers the exact decomposition of this signal. This is also consistent with 
the theoretical result we obtained in Theorem 12.11 The result shown in Fig. [I] is obtained 
by applying the non-uniform Fourier transform directly. As we proposed in our algorithm, 
for a well-resolved signal, it is more efficient to use a combination of interpolation and 
FFT. This procedure would introduce some interpolation error, however the computation 
is accelerated tremendously. As we see in Fig. [2j if we use the FFT-based algorithm, the 
error increase to the order of 10 -4 instead of 10 -11 in the previous result when we used the 
non-uniform Fourier transform. If we increase the number of sample points to 1024, the 
order of error decreases to W~ 7 . This indicates that the main source of error comes from 
the interpolation error. 

In our previous paper |12j . we have shown many numerical results to demonstrate the 
stability of our algorithm. These numerical examples confirm the theoretical results pre- 
sented in Theorem 12.21 and Theorem 12.31 We will not reproduce these numerical examples 
in this paper. 

Example 2: Exact recovery for a signal with random samples 

The second example is designed to confirm the result of Theorem 13.11 This example shows 
that for a signal with a sparse structure, our algorithm is capable of producing the exact 
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Figure 2: Left: Error of the IMF and the phase function with 256 uniform samples; Right: 
Error of the IMF and the phase function with 1024 uniform samples. 

decomposition even if it is poorly sampled. The signal is given below in (jl24|) . 

9 = 200vrt - 10cos2vrt - 2sin4vrt, 9 = 9/(100) 
ao = cos 9, a± = 3 + cos 9 + sin 29 

f = a + aicos9. (124) 

The number of sample points is set to be 120. These sample points are selected at random 
over 4096 uniformly distributed points. On average, there are only 1.2 points in each period 
of the signal. We test 100 independent samples and our algorithm is able to recover the 
signal for 97 samples, which gives 97% success rate. Fig. [3] gives one of the successful 
samples. 

The right panel of Fig. [3] shows that the order of error is 10" 2 for IMF and 10" 3 for the 
phase function. In the computation, the I 1 optimization problem is solved approximately 
in each step of the iteration. This is the reason that the error is much larger than the 
round-off error of the computer. If we increase the accuracy in solving the I 1 optimization 
problem, the algorithm would give a more accurate result. However the computational cost 
also increases as a consequence. We also reduce the number of sample points to 80 and 
carry out the same test for 100 times. In this case, the recovery rate was 46 out of 100. 

Example 3: Approximate recovery for a signal with random samples 

In this example, we will check the stability of our algorithm for a sparsely sampled signal. 
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Figure 3: Left: Original signal and the sample points; Right: Error of the IMF and phase 
function. 

The signal is generated by (|125|) . 

= O + 0.1sin(120vrt), 

ao = cos(27rt), a± = 3 + cos(2-7rt) + sin(47rt) 

f = a + ai cose + 0.1X(t). (125) 

where 9q is the 9 given in (I124p . and X{t) is the Gaussian noise with standard deviation 
a 1 = 1. Based on the signal in the previous example, we add one small high frequency 
component on the phase function such that this high frequency part cannot be captured 
during the iteration. Moreover, ao and a% are not exactly sparse over the Fourier basis in the 
0-space. We also add a white noise to the original signal to make it even more challenging 
to decompose. 

In this example, when the number of sample points is 120, our method can give 92 
successful recoveries in 100 independent tests. Fig. |4] gives one of the successful recoveries 
obtained by our algorithm. Due to the truncation error and the noise, the error becomes 
much larger than that in the previous example. But all the errors are comparable with 
the magnitude of the truncation error and noise, which shows that our method has good 
stability even for signals with rare samples. When the number of samples is reduced to 80, 
the recovery rate drops to 40 out of 100. 
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Time Time 

Figure 4: Left: Original signal (blue) and the sample points (red) in Ex 3; Right: Errors of 
ao, cl\ and 9. 

5 Concluding remarks 

In this paper, we analysed the convergence of the data-driven time- frequency analysis 
method proposed in |12| . First, we considered the case when the number of sample points 
is large enough. We proved that the algorithm we developed would converge to the exact 
decomposition if the signal has an intrinsic sparsity structure in the coordinate determined 
by the phase function. We also proved the convergence of our method with an approximate 
decomposition when the signal does not have an exact sparse structure but its spectral 
coefficients have a fast decay. 

We also considered the more challenging case when only a few number of samples are 
given which do not resolve the original signal accurately. In this case, we need to solve a 
I 1 minimization problem which is computationally more expensive. We proved the stability 
and convergence of our method by using some results developed in compressive sensing. As 
in compressive sensing, the convergence and stability of our method assumes that certain 
^-restricted isometry condition is satisfied. We proved that for each fixed step in the 
iteration, this /S-restricted isometry condition is satisfied with an overwhelming probability 
if the sample points are selected at random. 

We presented numerical evidence to support our theoretical results. Our numerical 
results confirmed the theoretical results in all cases that we considered. 

We are currently working on the convergence of the data-driven time-frequency analysis 
method for non-periodic signals. Our extensive numerical results seem to indicate that our 
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method also converges for non-periodic signals. The theoretical analysis for this problem is 
more challenging. We will report the result in a subsequent paper. 
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Appendix A: Error of the envelope functions 

Suppose 



f(t) = Mt) + fi(t) cose 

is the signal we want to decompose. 

Let a m = h cos A9 m , b m = /i sin A6> m , then, we have 

/ = / + a m cos6> m -& m sinr\ 

Let L m = em(T) 2 ~ em(0) and Q m = 9 m /{2^L m ). Then, we have 



(126) 



f = f + a m cos 2ttL t 
Define the Fourier transform in #-space as: 



So 



f(t)e 



b m sm2irL m 



-i2nkW Aff"- 



(127) 



(128) 



(129) 



Applying Fourier transform to both sides of (|128p . we have 

/*»(*) = SoM k ) + \ + Lm ) + ^( k + Lm ) ~^(k ~ L m )) • (130) 

Then, we get 

a^ m {k) - ib^ m {k) = 2f em (k - L m ) - 2f , em (k - L m ) - af m {k - 2L m ) - ib^ m {k - 2L m ), 
Z^ m (k) + ibgL(k) = 2f Bm {k + L m ) - 2fo i e m (k + L m ) - c$n(k + 2L m ) + ibgL(k + 2L m ). 

It is easy to solve for aVL and WL to obtain: 



- 2L rn )) - - [bf m {k + 2L m ) - b^ m {k - 2L n 
L m )) + i \foM k + Lm ) ~ SoM k ~ Lm ) 



a$n(k) = f 6 m{k + L m ) + f em {k - L m ) - [foMk + L m ) + SoM k ~ U 
+ ^(a^m(k + 2L m ) + al 

bf m (k) = -i (Jgm (k + L m ) -fgm( 

+\ (fi$»(k + 2L m ) -a%L(k- 2L m )) - % - ( V-Utt- + 21"' ) + 1ft,, (/,- - 21"' ) ; 



,(131) 



.(132) 
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In our algorithm, J r em(a m ) and J-Qm{b m ) are approximated in the following way: 
agm(k) = 



fg m (k + L rn ) + f Q m{k - L m ), -L m /2 <k< L m /2, 
0, otherwise. 



bgm (k) 



-i{f Q m{k + L m ) - f B m(k - L m )), -L m /2 <k< L m /2, 
0, otherwise. 



(133) 
(134) 



Then, we can get the error of the approximation in the spectral space: 

' - fo,eA k + Lm ) + foM k ~L m ) + \ (a$n(k + 2L m ) + a™ (k - 2L m )) 
Aa 0m (k) = { -i (b™ m {k + 2L m ) - b^ m (k - 2L m ) 



< L m /2, 
> L m /2. 



fo,pn(k + L m ) - f 0>em (k - L m ) + i (5^ m {k + 2L m ) - cff m (k - 2L m )) 



Ab e m(k) 



' 'b™ m {k + 2L m ) + b" d 



2U 



\k\ < L m /2, 

bf m (k), \k\>L m /2. 
Thus, we have the following inequality for the I 1 norm of the error in the spectral space: 



|Aa m | < ||Aa fl m||i 

< 2 h^( k ) + E 



V< fc <l Lri 



|L" l <fc<|L" 



j m {k)\+b% m {k))+ e lo^wi- ( 135 ) 



Similarly, we get 

|A6 m |<2 £ \foM k )\+ E (l^(*)l + |^( fc )|) + E ( 136 ) 

^<fc<§L™ §L™<fc<§L'™ \k\>^ 

In the above derivation, we assume that the Fourier transform of / in # m -space can be 
calculated exactly. If only approximate Fourier transform is available, denoted as fg m , such 
as the signal with sparse samples we discussed in Section [3l there would be an extra term 
in the estimates of Aa m and Ab rn , 

\Aa m \ < 2 Y + E (\a%L(k)\+b%L(k) 



§L m <fc<§L" 



E e 



ifei>- 



V< fe <f ir ' 



(137) 



A6 m | < 2 E ToM k )\+ E 



^■<fc<fx« 



f L m <fc<f L" 



E few +2 E 



/ 9 m(fc) - / 0m (A;) 



(138) 
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Appendix B: Estimates of /o,e m (^)? a^L(uj) and 6^„(w)in Theorem 12.11 

We first estimate /q. We have 



l/o,<? m MI 



fo(t)e 



-i2TTUje ( n 



E /o, e (fc)e i2 " ( " ^ 

|fc|<Mi 

E /o,9(ife) C i^-^dT 

\k\<Mi 
|fe|<Mi 



(139) 



where a = L m /L. In the last equality, we have used the fact that 9 = 2ttL9, 6 m = 27rL m 6 ' 
and 9 = 9 m + A9 m . 

Using Lemma \2.1\ we obtain for any \u\ > L/2 that 



|/o,0 m MI < 



E he(k) I e i2 ^ ak -^ e e lkAem / L dr 

\k\<Ah 







< c V qm q y 



(2^M o )^||^ m [(A0 m )']||{ 



where 



< 2C Q 



-P(z,n) 
min(^ m ) / 



M n Mx E^itW, 



(140) 



l|j-[(0']lli 



mm 



2vrM n 



(141) 



In the above derivation, we need to assume that L > AM\ such that \u — ak\ > \uj\/2 for 
all \u\ > L/2 and | A? | < Mi. 

If we further assume that 7 < 1/4, we have 



\foM^)\ < C Q( y ) MqMij. 



(142) 



Next, we estimate aS™. The method of analysis is similar to the previous one, however 
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the derivation is a little more complicated. We proceed as follows: 



\af m (u})\ 



1 

< - 
~ 2 



1 

< - 
~ 2 



h(t)cosM m {t)e- i2 ™ e d6 m 



|fc|<A/i 
■\<M 1 Jo 



|fc|<Mi ^° 



For the first term in the above inequality, we have that for any |u;| > L/2, 



\k\<Mi 



< 



C Q J2 



M « 



E 



,w — ak\ n 
\k\<Mi 1 1 3=1 

-n n 



1 + I 



3=1 



|fe|<Mi 



< 4C Q ( Y ) M n (2Mi + 1) 7 . 



(143) 



(144) 



Here we also assume that L > 4Mi, 7 < 1/4. The definition of Q and 7 can be found in 

EU). 

For the second term in (]143p . we can get the same bound for \cj\ > L/2, 



\k\<Mi 



< 4C Q ( y ) M n (2M! + 1)7. (145) 



By combining (jl43j) . (jl44|) and (|145j) . we obtain a complete control of a, 



|agS.(w)| < 4C Q (^J M "(2M 1 + 1) 7 , V|w| > L/2. 
Similarly, we can estimate b™ m by the same upper bound, 

< 4C Q (" Jy) M n (2M! + 1) 7 , V|w| > L/2. 



(146) 



(147) 
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Appendix C: Proof of Lemma 13.11 

Proof. Since e i27rfc ^ is a periodic function over [0, 1], it can be represented by Fourier series: 

i2nk<f>(t) 



+oo 



£ d ie ^ lt , te[0,l), 



(148) 



l=— oo 



where di = e l27rfe <K*) e t2nlt dt. By assumption, we have <f>'(t) G Vm . Thus, we get 



M 



<f>'(t) = c i e ' 27ri *> ^[0,1], 

j=-M 



in 

Then, we have 



where Cj = 6 {t)e- i2 ^ l dt. 



L-l 



J2nkcf>(t m ) 



\ E 

L-l M +00 

r E E E c^e^*" 



,i27r(Z+j)m/L 



m=0j=— Mo i=— oo 
. Mo +oo L-l 

i E E c ^ E - 

j=-M l=-oo m=0 
M _^ 

E E C 3^V L -Q 

j=-M op ez 

M M 

E c i d -i+ E E c iV-i 

j=-M j=-Af peZ,p^0 
r l M 



j=-A/ pez, P ^o 



M 

= E E 

j=-M pez,p^o 

Using integration by parts, we have 

-i 



dt\ 







< 



< — — max 

~ \l\ n t 



d" 



dt n J 



dt 



d n i2nkct>\ 



dt' 



(149) 



(150) 



(151) 
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Using the inequality (|25p in the proof of Lemma 12. 1\ and by a direct calculation, we can 
show that for any n > 0, there exists C(n) > 0, such that 



max 

t 



J2nk 



dt n 



< C(n)J2\k\^- j U'\\i =C(n)\k\M^ 1 \\$'\ 



|fe|«l 



H^Hi - 1 



< 



2C(n)|Af||^||?, Jfi||0 > ||i>2 ; 
2C(n)(2M )«, fi||^||i<2. 



As a result, we obtain 



Mi 

M 



(152) 



Idil < 



2C(n) 



IfeHl^lli >2M , 



2C(n)|^a| n , |fc|||0>|| 1 <2Mo, 
Finally, we derive the following estimate 



M 

j=-M pez,p^o 
M 

< Yl Yl \ c i\\ d pL-j\ 

pez,p^oj=-M 

Mo +oo 

< 2 \cj\y y max\d pL - 

j=-M p=l 3 
+00 

< 4C(n)||0 / ||i^max 

P =i 





n 


2M 




pL - M 


j 


pL - M 







kU'Wl 


n 


2M 




( 


L 


j 


L 


1 



< iC(n)\\(f/\\i max 



< 4(l-M /L)-" +1 -^|||^|| 1 max 
n — 1 



5>-M /L)- 
p=i 





n 


2M 




L 


5 


L 





(153) 



(154) 
□ 
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